161 research outputs found

    Universal Learning of Repeated Matrix Games

    We study and compare the learning dynamics of two universal learning algorithms, one based on Bayesian learning and the other on prediction with expert advice. Both approaches have strong asymptotic performance guarantees. When confronted with the task of finding good long-term strategies in repeated 2x2 matrix games, they behave quite differently. Comment: 16 LaTeX pages, 8 eps figures

    Asymptotics of Discrete MDL for Online Prediction

    Minimum Description Length (MDL) is an important principle for induction and prediction, with strong relations to optimal Bayesian learning. This paper deals with learning non-i.i.d. processes by means of two-part MDL, where the underlying model class is countable. We consider the online learning framework, i.e. observations come in one by one, and the predictor is allowed to update his state of mind after each time step. We identify two ways of predicting by MDL for this setup, namely a static and a dynamic one. (A third variant, hybrid MDL, will turn out inferior.) We will prove that under the only assumption that the data is generated by a distribution contained in the model class, the MDL predictions converge to the true values almost surely. This is accomplished by proving finite bounds on the quadratic, the Hellinger, and the Kullback-Leibler loss of the MDL learner, which are however exponentially worse than for Bayesian prediction. We demonstrate that these bounds are sharp, even for model classes containing only Bernoulli distributions. We show how these bounds imply regret bounds for arbitrary loss functions. Our results apply to a wide range of setups, namely sequence prediction, pattern classification, regression, and universal induction in the sense of Algorithmic Information Theory, among others. Comment: 34 pages
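
    The static and dynamic variants can be illustrated with a small sketch. It assumes one common reading of two-part MDL: the static predictor picks the model maximizing weight times likelihood and predicts with that model, while the dynamic predictor takes the ratio of the maximized weighted likelihoods with and without the candidate next symbol. The finite Bernoulli class, the function names and the weights are hypothetical and chosen only for illustration; the abstract does not specify them.

```python
import math

def log_score(theta, weight, bits):
    """Log of weight * Bernoulli(theta) likelihood of the bit sequence.

    Assumes 0 < theta < 1 and weight > 0 (illustrative restriction).
    """
    ones = sum(bits)
    zeros = len(bits) - ones
    return math.log(weight) + ones * math.log(theta) + zeros * math.log(1 - theta)

def static_mdl_predict(models, bits):
    """Probability of a 1 under the single model with the best two-part score."""
    theta, _ = max(models, key=lambda m: log_score(m[0], m[1], bits))
    return theta

def dynamic_mdl_predict(models, bits):
    """Ratio of best scores with and without an appended 1.

    The values for next symbol 0 and 1 need not sum to one.
    """
    def best(seq):
        return max(log_score(theta, w, seq) for theta, w in models)
    return math.exp(best(bits + [1]) - best(bits))

# Hypothetical model class: two Bernoulli parameters with equal prior weight.
models = [(0.25, 0.5), (0.75, 0.5)]
bits = [1, 1, 1, 0]
p_static = static_mdl_predict(models, bits)
p_dynamic = dynamic_mdl_predict(models, bits)
```

    On this toy input both predictors select the same model, so they agree; they can differ when appending the next symbol changes which model attains the maximum.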

    Adaptive Online Prediction by Following the Perturbed Leader

    When applying aggregating strategies to Prediction with Expert Advice, the learning rate must be adaptively tuned. The natural choice of sqrt(complexity/current loss) renders the analysis of Weighted Majority derivatives quite complicated. In particular, for arbitrary weights there have been no results proven so far. The analysis of the alternative "Follow the Perturbed Leader" (FPL) algorithm from Kalai & Vempala (2003) (based on Hannan's algorithm) is easier. We derive loss bounds for adaptive learning rate and both finite expert classes with uniform weights and countable expert classes with arbitrary weights. For the former setup, our loss bounds match the best known results so far, while for the latter our results are new. Comment: 25 pages
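
    A minimal sketch of the Follow the Perturbed Leader idea may help. It assumes each expert is scored by its cumulative loss plus (complexity minus an exponential perturbation) divided by the learning rate, with complexity taken as the negative log of the expert's weight. The learning-rate schedule 1/sqrt(t) is a simplifying assumption, not the adaptive rate analysed in the paper, and all function names are hypothetical.

```python
import math
import random

def fpl_choose(cum_loss, weights, eta, rng):
    """Return the index of the expert with the smallest perturbed score."""
    best, best_score = None, math.inf
    for i, (loss, w) in enumerate(zip(cum_loss, weights)):
        complexity = -math.log(w)          # k_i = -ln w_i (assumed form)
        perturbation = rng.expovariate(1)  # fresh exponential noise per expert
        score = loss + (complexity - perturbation) / eta
        if score < best_score:
            best, best_score = i, score
    return best

def run_fpl(loss_rows, weights, seed=0):
    """Play FPL on a sequence of per-expert loss rows; return (own loss, cumulative losses)."""
    rng = random.Random(seed)
    cum_loss = [0.0] * len(weights)
    total = 0.0
    for t, row in enumerate(loss_rows, start=1):
        eta = 1.0 / math.sqrt(t)  # assumption: simple decreasing learning rate
        choice = fpl_choose(cum_loss, weights, eta, rng)
        total += row[choice]
        # Full-information setting: every expert's loss is revealed.
        cum_loss = [c + l for c, l in zip(cum_loss, row)]
    return total, cum_loss

# Toy run: expert 0 never errs, expert 1 always errs.
rows = [[0.0, 1.0]] * 200
total, cum = run_fpl(rows, weights=[0.5, 0.5])
```

    In this toy run the learner's loss stays far below that of the bad expert, because the perturbation is quickly outweighed by the growing gap in cumulative losses.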

    Nonstochastic bandits: Countable decision set, unbounded costs and reactive environments

    The nonstochastic multi-armed bandit problem, first studied by Auer, Cesa-Bianchi, Freund, and Schapire in 1995, is a game of repeatedly choosing one decision from a set of decisions (“experts”), under partial observation: In each round t, only the cost of the decision played is observable. A regret minimization algorithm plays this game while achieving sublinear regret relative to each decision. It is known that an adversary controlling the costs of the decisions can force the player to incur a regret growing as $t^{1/2}$ in the time t. In this work, we propose the first algorithm for a countably infinite set of decisions that achieves a regret upper bounded by $O(t^{1/2+\varepsilon})$, i.e. arbitrarily close to optimal order. To this aim, we build on the “follow the perturbed leader” principle, which dates back to work by Hannan in 1957. Our results hold against an adaptive adversary, for both the expected and high probability regret of the learner w.r.t. each decision. In the second part of the paper, we consider reactive problem settings, that is, situations where the learner’s decisions impact on the future behaviour of the adversary, and a strong strategy can draw benefit from well chosen past actions. We present a variant of our regret minimization algorithm which still has regret of order at most $t^{1/2+\varepsilon}$ relative to such strong strategies, and even sublinear regret not exceeding $O(t^{4/5})$ w.r.t. the hypothetical (without external interference) performance of a strong strategy. We show how to combine the regret minimizer with a universal class of experts, given by the countable set of programs on some fixed universal Turing machine. This defines a universal learner with sublinear regret relative to any computable strategy.

    Master algorithms for active experts problems based on increasing loss values

    We specify an experts algorithm with the following characteristics: (a) it uses only feedback from the actions actually chosen (bandit setup), (b) it can be applied with countably infinite expert classes, and (c) it copes with losses that may grow in time appropriately slowly. We prove loss bounds against an adaptive adversary. From this, we obtain master algorithms for "active experts problems", which means that the master's actions may influence the behavior of the adversary. Our algorithm can significantly outperform standard experts algorithms on such problems. Finally, we combine it with a universal expert class. This results in a (computationally infeasible) universal master algorithm which performs - in a certain sense - almost as well as any computable strategy, for any online problem.

    Quantitative real-time imaging of intracellular FRET biosensor dynamics using rapid multi-beam confocal FLIM

    Fluorescence lifetime imaging (FLIM) is a quantitative, intensity-independent microscopical method for measurement of diverse biochemical and physical properties in cell biology. It is a highly effective method for measurements of Förster resonance energy transfer (FRET), and for quantification of protein-protein interactions in cells. Time-domain FLIM-FRET measurements of these dynamic interactions are particularly challenging, since the technique requires excellent photon statistics to derive experimental parameters from the complex decay kinetics often observed from fluorophores in living cells. Here we present a new time-domain multi-confocal FLIM instrument with an array of 64 visible beamlets to achieve parallelised excitation and detection with average excitation powers of ~ 1–2 μW per beamlet. We exemplify this instrument with up to 0.5 frames per second time-lapse FLIM measurements of cAMP levels using an Epac-based fluorescent biosensor in live HeLa cells with nanometer spatial and picosecond temporal resolution. We demonstrate the use of time-dependent phasor plots to determine parameterisation for multi-exponential decay fitting to monitor the fractional contribution of the activated conformation of the biosensor. Our parallelised confocal approach avoids having to compromise on speed, noise, or accuracy in lifetime measurements and provides a powerful means to quantify biochemical dynamics in living cells.
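
    As a rough illustration of the phasor representation mentioned above, the sketch below computes the standard phasor coordinates (intensity-normalised cosine and sine transforms) of a time-domain decay histogram. For a single-exponential decay with lifetime tau, these are known to be g = 1/(1+(omega*tau)^2) and s = omega*tau/(1+(omega*tau)^2). The bin width, lifetime, modulation frequency and function name are hypothetical values chosen for the example, not parameters of the instrument described.

```python
import math

def phasor(counts, bin_width, omega):
    """Phasor coordinates (g, s) of a decay histogram.

    counts    -- photon counts per time bin
    bin_width -- bin width (same time unit as 1/omega)
    omega     -- angular frequency used for the transform
    """
    total = sum(counts)
    g = s = 0.0
    for k, c in enumerate(counts):
        t = (k + 0.5) * bin_width  # evaluate at the bin centre
        g += c * math.cos(omega * t)
        s += c * math.sin(omega * t)
    return g / total, s / total

# Hypothetical example: ideal single-exponential decay, tau = 2.5 ns,
# sampled over a long window so that truncation is negligible.
tau = 2.5                       # ns (assumed)
omega = 2 * math.pi * 0.08      # rad/ns, i.e. 80 MHz (assumed)
bin_width = 0.01                # ns (assumed)
counts = [math.exp(-(k + 0.5) * bin_width / tau) for k in range(10000)]
g, s = phasor(counts, bin_width, omega)
```

    A single-exponential decay lands on the semicircle g^2 + s^2 = g; mixtures of species fall inside it, which is what makes the plot useful for choosing a multi-exponential fitting model.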